feat: Add support for LEFT JOIN LATERAL#21352

Open
neilconway wants to merge 5 commits into apache:main from
neilconway:neilc/feat-lateral-outer-joins

@neilconway
Contributor

@neilconway neilconway commented Apr 3, 2026

Which issue does this PR close?

Rationale for this change

This PR adds support for LEFT join semantics for lateral joins. This is a bit tricky because of how it interacts with compensation for the "count bug". This might be easiest to illustrate with an example; consider this query (Q1):

  SELECT t1.id, sub.cnt
  FROM t1 LEFT JOIN LATERAL (
      SELECT count(*) AS cnt FROM t2 WHERE t2.t1_id = t1.id
  ) sub ON sub.cnt > 0
  ORDER BY t1.id;

The initial decorrelation (Q2) is:

  SELECT t1.id, sub.cnt
  FROM t1 LEFT JOIN (
      SELECT count(*) AS cnt, t2.t1_id, TRUE AS __always_true
      FROM t2 GROUP BY t2.t1_id
  ) sub ON t1.id = sub.t1_id

This ignores the user's original ON clause for now. Q2 is wrong, because t1 rows that have no match in t2 get all-NULL values rather than 0 for count(*) over an empty set. This is the "count bug". We compensate by detecting rows where __always_true is NULL and replacing the aggregate value with that aggregate's default for empty input (Q3):

  SELECT t1.id,
         CASE WHEN sub.__always_true IS NULL THEN 0
              ELSE sub.cnt END AS cnt
  FROM (  /* Q2 */  )
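
To see the count bug and its compensation concretely, here is a minimal sketch that runs Q2 and Q3 through Python's sqlite3 module. The sample rows are made up for illustration, and SQLite stands in for DataFusion only because the decorrelated queries are plain SQL; this is not the optimizer's actual output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER);
    CREATE TABLE t2 (t1_id INTEGER);
    INSERT INTO t1 VALUES (1), (2);   -- illustrative data
    INSERT INTO t2 VALUES (1), (1);   -- id 2 has no rows in t2
""")

# Q2: naive decorrelation. The unmatched t1 row is NULL-padded.
q2 = """
    SELECT t1.id AS id, sub.cnt AS cnt, sub.__always_true AS __always_true
    FROM t1 LEFT JOIN (
        SELECT count(*) AS cnt, t2.t1_id AS t1_id, TRUE AS __always_true
        FROM t2 GROUP BY t2.t1_id
    ) sub ON t1.id = sub.t1_id
"""
naive = conn.execute(f"SELECT id, cnt FROM ({q2}) ORDER BY id").fetchall()

# Q3: a NULL marker column means "no group existed", so substitute
# count(*)'s value for an empty input, which is 0.
compensated = conn.execute(f"""
    SELECT id,
           CASE WHEN __always_true IS NULL THEN 0 ELSE cnt END AS cnt
    FROM ({q2}) ORDER BY id
""").fetchall()

print(naive)        # [(1, 2), (2, None)]  <- count bug: NULL instead of 0
print(compensated)  # [(1, 2), (2, 0)]
```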

Now we just need to handle the user's original ON clause. We can't add it to the rewritten ON clause in Q2, because rows that fail the condition would be NULL-padded by the join and the count-bug compensation would then fire for them. We also can't just add it to the WHERE clause, because we need left join semantics: every t1 row must be preserved. Instead, we wrap the result in another CASE that re-checks the ON condition and substitutes NULL for every right-side column when the condition is not satisfied:

  SELECT t1.id,
         CASE WHEN (cnt > 0) IS NOT TRUE THEN NULL
              ELSE cnt END AS cnt
  FROM (  /* Q3 */  )
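
The three options discussed above can be compared side by side. The following is a rough sketch, again using Python's sqlite3 module with made-up sample rows as a stand-in for DataFusion; the helper names q3 and run are hypothetical. For Q1 on this data, the expected answer is that id 1 keeps its count of 2 and id 2 is preserved with a NULL cnt, because its count of 0 fails the ON condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER);
    CREATE TABLE t2 (t1_id INTEGER);
    INSERT INTO t1 VALUES (1), (2);   -- illustrative data
    INSERT INTO t2 VALUES (1), (1);   -- id 2 has no rows in t2
""")

def q3(extra_on=""):
    """Q2 plus count-bug compensation; extra_on is appended to the join's ON."""
    return f"""
        SELECT t1.id AS id,
               CASE WHEN sub.__always_true IS NULL THEN 0
                    ELSE sub.cnt END AS cnt
        FROM t1 LEFT JOIN (
            SELECT count(*) AS cnt, t2.t1_id AS t1_id, TRUE AS __always_true
            FROM t2 GROUP BY t2.t1_id
        ) sub ON t1.id = sub.t1_id {extra_on}
    """

def run(sql):
    return conn.execute(sql + " ORDER BY id").fetchall()

# Option 1: merge the user's condition into the rewritten ON clause.
# The compensation still turns the NULL-padded row into 0.
merged_on = run(q3("AND sub.cnt > 0"))

# Option 2: apply the condition as a WHERE filter. The t1 row is lost.
where_filter = run(f"SELECT id, cnt FROM ({q3()}) WHERE cnt > 0")

# Option 3: re-check the condition in an outer CASE and nullify.
case_wrap = run(f"""
    SELECT id,
           CASE WHEN (cnt > 0) IS NOT TRUE THEN NULL ELSE cnt END AS cnt
    FROM ({q3()})
""")

print(merged_on)     # [(1, 2), (2, 0)]     wrong: cnt should be NULL for id 2
print(where_filter)  # [(1, 2)]             wrong: id 2 is dropped
print(case_wrap)     # [(1, 2), (2, None)]  matches LEFT JOIN LATERAL semantics
```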

What changes are included in this PR?

  • Implement lateral left join rewrite as described above
  • Update expected tests and add more test cases
  • Update documentation

Are these changes tested?

Yes. All new test queries were also run against DuckDB to confirm that both systems produce the same results.

Are there any user-facing changes?

Support for a new feature.

@neilconway neilconway changed the title from "Neilc/feat lateral outer joins" to "feat: Add support for LEFT JOIN LATERAL" Apr 3, 2026

@crm26 crm26 left a comment

Reviewed the approach. The three-way ON clause handling (Inner → post-join filter, LEFT without scalar agg → merge into join ON, LEFT with scalar agg → CASE WHEN nullification) is well-reasoned and avoids the count-bug compensation conflict.
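
As a rough illustration of the first two cases (no scalar aggregate), the sketch below uses Python's sqlite3 module to run what the decorrelated joins could look like for a lateral subquery of the form SELECT t2.v FROM t2 WHERE t2.t1_id = t1.id with a user ON condition of sub.v > 10. The column v, the threshold, and the sample rows are assumptions made for this example, and the SQL is hand-written rather than taken from the optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER);
    CREATE TABLE t2 (t1_id INTEGER, v INTEGER);
    INSERT INTO t1 VALUES (1), (2);                 -- illustrative data
    INSERT INTO t2 VALUES (1, 5), (1, 20), (2, 3);
""")

# LEFT without a scalar aggregate: the user's condition joins the
# correlation predicate in ON, so unmatched t1 rows are NULL-padded.
left_merged = conn.execute("""
    SELECT t1.id, sub.v
    FROM t1 LEFT JOIN t2 AS sub
      ON t1.id = sub.t1_id AND sub.v > 10
    ORDER BY t1.id
""").fetchall()

# INNER: the user's condition can run as a post-join filter, because
# rows that fail it are supposed to disappear anyway.
inner_filtered = conn.execute("""
    SELECT t1.id, sub.v
    FROM t1 JOIN t2 AS sub ON t1.id = sub.t1_id
    WHERE sub.v > 10
    ORDER BY t1.id
""").fetchall()

print(left_merged)     # [(1, 20), (2, None)]
print(inner_filtered)  # [(1, 20)]
```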

We have 8 production views that use LATERAL JOIN for date range expansion (generate_series patterns) that currently require manual refactoring for DataFusion. This PR unblocks them directly.

Tested the existing basic LATERAL support — clean. Looking forward to LEFT JOIN LATERAL landing.

@crm26 crm26 left a comment

Follow-up: validated locally beyond the initial review.

  • Full optimizer test suite (656 tests): all pass, no regressions
  • All sqllogictest files (lateral_join.slt, joins.slt): pass
  • Clippy clean
  • Code reviewed: three-way ON clause handling is correct — INNER → post-filter, LEFT without scalar agg → merged ON, LEFT with scalar agg → CASE WHEN nullification
  • Edge cases verified: empty right side, NULL join keys, nested laterals, non-trivial ON with COUNT, multi-scope correlation guard

Clean merge against current main (8 commits behind, no conflicts in optimizer or lateral code).

@neilconway
Contributor Author

@crm26 Thanks for the review! And I'm glad to hear that you'll find lateral joins to be helpful. If you have more feedback on the feature in the future, please share it!

@mbutrovich mbutrovich self-requested a review April 10, 2026 16:17
@mbutrovich
Contributor

I'm going to review this further, but one quick suggestion for a regression test: DuckDB recently shipped a bug (duckdb/duckdb#21609) where a WHERE l.k IS NOT NULL filter failed to eliminate NULL rows produced by LEFT JOIN LATERAL. It was fixed in 1.5.1, but it shows this is an easy edge case to get wrong. None of the current tests exercise interaction between a post-join WHERE filter and left-preserved NULL rows. Something like:

-- Outer WHERE filter must eliminate NULL-producing LEFT lateral rows
SELECT r.k AS rk, l.k AS lk
FROM src_r AS r
LEFT JOIN LATERAL (
    SELECT l.k FROM src_l AS l WHERE l.k > r.k
) AS l ON TRUE
WHERE l.k IS NOT NULL;

where src_r and src_l both contain '2001-01-01' -- the lateral produces no match, LEFT JOIN fills NULLs, and the WHERE should return 0 rows. I verified this returns the correct result (0 rows) on this branch.
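
The expected semantics can be sanity-checked outside DataFusion. This is only a sketch: SQLite has no LATERAL, so it uses a hand-decorrelated form of the query above, which is equivalent here because the lateral subquery is just a correlated filter on src_l. The table contents follow the single-row setup described above, with k stored as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_r (k TEXT);
    CREATE TABLE src_l (k TEXT);
    INSERT INTO src_r VALUES ('2001-01-01');
    INSERT INTO src_l VALUES ('2001-01-01');
""")

# Hand-decorrelated equivalent of the LEFT JOIN LATERAL ... ON TRUE query.
join = """
    SELECT r.k AS rk, l.k AS lk
    FROM src_r AS r
    LEFT JOIN src_l AS l ON l.k > r.k
"""

# Without the outer filter, the unmatched left row is NULL-padded.
padded = conn.execute(join).fetchall()

# The outer WHERE must then remove that NULL-padded row.
filtered = conn.execute(join + " WHERE l.k IS NOT NULL").fetchall()

print(padded)    # [('2001-01-01', None)]
print(filtered)  # []
```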

What I see so far looks great, but I'll keep digging. Thanks @neilconway!

@github-actions github-actions bot added the documentation, optimizer, and sqllogictest labels Apr 11, 2026
@neilconway
Contributor Author

@mbutrovich Thanks, that's a great idea! I added a test for this case.

@neilconway
Contributor Author

I added a bunch more test cases, covering other edge cases of this feature. I also ran all the tests against DuckDB to confirm that the queries produce the same results.

Labels

  • documentation: Improvements or additions to documentation
  • optimizer: Optimizer rules
  • sqllogictest: SQL Logic Tests (.slt)

Development

Successfully merging this pull request may close these issues.

Support LEFT JOIN LATERAL

3 participants